The morning session assumes that you bring your own laptop, with the software and the data we will work with already downloaded. This page summarises everything you need to know to prepare for the workshop.
Both morning workshops require that you download a total of four files: a corpus sample, a word vector model and two software packages. The software packages do not require installation, so they can be used without an administrator account on the computer, and they work on any operating system.
During the entire morning session, we’ll work with a subset of the EC archive corpus, which can be downloaded here:
Once the files have been downloaded and unzipped, we recommend putting them in a folder named "workshop" somewhere on your desktop.
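For those who prefer to script this step, the folder can also be created and checked with a few lines of Python. This is only a minimal sketch and is not part of the workshop material: the function names `prepare_workshop_folder` and `list_contents` are hypothetical, and it assumes that Python 3.9 or later is available and that you pass in the location of your desktop yourself.

```python
from pathlib import Path


def prepare_workshop_folder(desktop: Path) -> Path:
    """Create a "workshop" folder under `desktop` if it is missing and return its path."""
    folder = desktop / "workshop"
    # exist_ok=True makes the call safe to repeat if the folder already exists.
    folder.mkdir(parents=True, exist_ok=True)
    return folder


def list_contents(folder: Path) -> list[str]:
    """Return the sorted names of everything currently in the folder."""
    return sorted(entry.name for entry in folder.iterdir())
```

Calling `prepare_workshop_folder(Path.home() / "Desktop")` would create the folder on a desktop located at `~/Desktop` (an assumption; adjust the path if yours differs), and `list_contents` can then be used to confirm that the downloaded files ended up there.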
Detailed instructions will be handed out during the workshop for each step of the process, so apart from downloading the software and the test data, no other preparation is required.
Support: if you have trouble downloading the software or the test data, please send a tweet or an email to @sethvanhooland.
The tutorials do not require any prior knowledge of computer science or programming.
For some of the software tools we will be using the command line, but all of the specific commands will be given and explained.
However, to save time, it is important that you at least know how to open a terminal in a specific folder. In Windows or Linux, the operation is very simple.
If you have a Mac, please enable the "New terminal at folder" option. To do this, go to "System Preferences", then "Keyboard", then "Shortcuts", then "Services", and check the box "Enable New Terminal at Folder". This operation will save you a lot of time in the future.
In order for the workshop to run smoothly and to ensure a common base for the group discussions, we are already making available some results based on the test corpus:
The three files are gathered in a ZIP archive downloadable HERE.